-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: get rid of bespoke relocation logic used by the mergeQueue #62631
kvserver: get rid of bespoke relocation logic used by the mergeQueue #62631
Conversation
I'm going to add a randomized test (that just issues a bunch of random calls to
|
Release note: None
3ea0d8d
to
d0be2f0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very satisfying!
Reviewed 1 of 1 files at r1, 4 of 4 files at r2, 4 of 4 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvserver/allocator.go, line 193 at r2 (raw file):
// targetReplicaType.{Add,Remove}ChangeType methods wherever possible. func (t targetReplicaType) AddChangeType() roachpb.ReplicaChangeType { switch typ := t; typ {
Is there a reason for the value assignment in these switch statements? Would switch t {
accomplish the same thing with less code and less room for confusion?
pkg/kv/kvserver/allocator.go, line 205 at r2 (raw file):
// RemoveChangeType returns the roachpb.ReplicaChangeType corresponding to the // given targetReplicaType. func (t targetReplicaType) RemoveChangeType() roachpb.ReplicaChangeType {
Can we use this in replicateQueue.removeDead
?
pkg/kv/kvserver/client_merge_test.go, line 21 at r2 (raw file):
"reflect" "regexp" "sort"
I'm surprised to see an import change with no other code changes. Does this commit compile?
pkg/kv/kvserver/replica_command.go, line 2469 at r2 (raw file):
} // GetTargetsToCollocateRHSForMerge decides the configuration of RHS replicas
💯
pkg/kv/kvserver/replica_command.go, line 2875 at r2 (raw file):
StoreID: targetStore.StoreID, } if args.targetType == voterTarget {
Was there a reason to separate this block out from the next? Do you find it easier to read?
If we merged them, we could avoid scanning through the replicas twice, and could instead of a single iteration through existingNonVoters
.
pkg/kv/kvserver/replica_command.go, line 3001 at r2 (raw file):
// (Note that !shouldRemove implies that we're trying to remove the last // replica left in the descriptor which is illegal). ops = append(ops, roachpb.MakeReplicationChanges(args.targetType.RemoveChangeType(), removalTarget)...)
Why the append
here?
pkg/kv/kvserver/replicate_queue.go, line 1130 at r2 (raw file):
return false, err } if performingSwap {
nit: consider pulling this below rq.metrics.RebalanceReplicaCount.Inc(1)
so that the non-conditional metric comes first and the conditional metrics come after.
pkg/kv/kvserver/replicate_queue.go, line 1221 at r2 (raw file):
} log.VEventf(ctx, 1, "can't swap replica due to lease; falling back to add") return chgs, performingSwap, err
nit: we generally prefer returning a constant instead of a named return variable when the value is known. It makes things easier to read. So this would be return chgs, false, err
.
d0be2f0
to
9828460
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cleaned some things up and added a bunch more testing -- all in the second commit. Should be RFAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)
pkg/kv/kvserver/allocator.go, line 193 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is there a reason for the value assignment in these switch statements? Would
switch t {
accomplish the same thing with less code and less room for confusion?
No reason. Fixed now.
pkg/kv/kvserver/allocator.go, line 205 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Can we use this in
replicateQueue.removeDead
?
Done there and in removeDecommissioning
.
pkg/kv/kvserver/client_merge_test.go, line 21 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm surprised to see an import change with no other code changes. Does this commit compile?
Messed up while re-organizing my changes. Fixed now.
pkg/kv/kvserver/replica_command.go, line 2875 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Was there a reason to separate this block out from the next? Do you find it easier to read?
If we merged them, we could avoid scanning through the replicas twice, and could instead of a single iteration through
existingNonVoters
.
That wasn't intentional, fixed.
pkg/kv/kvserver/replica_command.go, line 3001 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Why the
append
here?
Fixed.
pkg/kv/kvserver/replicate_queue.go, line 1130 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: consider pulling this below
rq.metrics.RebalanceReplicaCount.Inc(1)
so that the non-conditional metric comes first and the conditional metrics come after.
Done.
pkg/kv/kvserver/replicate_queue.go, line 1221 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: we generally prefer returning a constant instead of a named return variable when the value is known. It makes things easier to read. So this would be
return chgs, false, err
.
Done.
3b59a24
to
b4aeeda
Compare
The new randomized test I added is flaking because it also asserts that |
94f2dba
to
74b6ec9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 7 files at r4, 7 of 7 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvserver/replica_command.go, line 2875 at r2 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
That wasn't intentional, fixed.
We're still searching for the non-voter twice, once with GetReplicaDescriptor
and once with an iteration over existingNonVoters
. Should we avoid that?
pkg/kv/kvserver/replica_command.go, line 2957 at r5 (raw file):
} } else if shouldAdd { ops = roachpb.MakeReplicationChanges(args.targetType.AddChangeType(), additionTarget)
It's probably better to avoid setting ops
if we're just going to immediately overwrite it on the canPromoteNonVoter
/canDemoteVoter
paths. This is currently a bit misleading and looks error-prone.
74b6ec9
to
db73198
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/kvserver/replica_command.go, line 2875 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We're still searching for the non-voter twice, once with
GetReplicaDescriptor
and once with an iteration overexistingNonVoters
. Should we avoid that?
Ugh, sorry for being careless!
Fixed for real now.
pkg/kv/kvserver/replica_command.go, line 2957 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It's probably better to avoid setting
ops
if we're just going to immediately overwrite it on thecanPromoteNonVoter
/canDemoteVoter
paths. This is currently a bit misleading and looks error-prone.
Done (this is what you meant, right?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r7.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvserver/replica_command.go, line 2957 at r5 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
Done (this is what you meant, right?).
Exactly.
pkg/kv/kvserver/replica_command.go, line 2847 at r7 (raw file):
} } if promotionTargetIdx >= 0 {
Do we need promotionTargetIdx
? Can we just move this condition into the loop?
voters to non-voters This commit updates `AdminRelocateRange` to use explicit atomic swaps of voting replicas with non-voting replicas, that cockroachdb#58627 initially added support for. The patch does so by generalizing behavior that's already exercised by the `replicateQueue` when it decides to rebalance replicas. See cockroachdb#61239. This allows us, in the next commit, to remove bespoke relocation logic that's used by the `mergeQueue` to align replica sets for the sake of a range merge. Release note: None
This commit removes the relocation logic used by the `mergeQueue` thus far to align replica sets (added in cockroachdb#56197). This logic previously existed in order to allow us to align the replica sets of a pair of ranges (which is required for the range merge to proceed), while avoiding redundant data movement. Before cockroachdb#58627 and the previous commit in this PR, `AdminRelocateRange` couldn't be directly used by the mergeQueue under various configurations of the LHS and RHS ranges. Furthermore, even when it could be used, it would involve redundant data movement. This all required us to compute relocation targets for the RHS of a merge separately, above the call to `AdminRelocateRange`, for the range merge to proceed. All these limitations have been resolved by the previous commit which teaches `AdminRelocateRange` to promote non-voters and demote voters when needed, and the aforementioned bespoke relocation logic is no longer needed. Release note: None
db73198
to
d7a7a49
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for being patient with me for these last few comments, and thanks for the review.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/kvserver/replica_command.go, line 2847 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do we need
promotionTargetIdx
? Can we just move this condition into the loop?
Moved it into the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r9.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei)
Build succeeded: |
This PR contains two main commits:
kvserver: update
AdminRelocateRange
to leverage explicit swaps ofvoters to non-voters
This commit updates
AdminRelocateRange
to use explicit atomic swaps ofvoting replicas with non-voting replicas, that #58627 initially added
support for. The patch does so by generalizing behavior that's already
exercised by the
replicateQueue
when it decides to rebalance replicas.See #61239.
This allows us, in the next commit, to remove bespoke relocation logic
that's used by the
mergeQueue
to align replica sets for the sake of arange merge.
Release note: None
kvserver: get rid of bespoke relocation logic used by the mergeQueue
This commit removes the relocation logic used by the
mergeQueue
thusfar to align replica sets (added in #56197). This logic previously
existed in order to allow us to align the replica sets of a pair of
ranges (which is required for the range merge to proceed), while
avoiding redundant data movement.
Before #58627 and the previous commit in this PR,
AdminRelocateRange
couldn't be directly used by the mergeQueue under various configurations
of the LHS and RHS ranges. Furthermore, even when it could be used, it
would involve redundant data movement. This all required us to compute
relocation targets for the RHS of a merge separately, above the call to
AdminRelocateRange
, for the range merge to proceed.All these limitations have been resolved by the previous commit which
teaches
AdminRelocateRange
to promote non-voters and demote voterswhen needed, and the aforementioned bespoke relocation logic is no
longer needed.
Resolves #62370
Release note: None